Regret testing: learning to play Nash equilibrium without knowing you have an opponent
Abstract
A learning rule is uncoupled if a player does not condition his strategy on the opponent’s payoffs. It is radically uncoupled if a player does not condition his strategy on the opponent’s actions or payoffs. We demonstrate a family of simple, radically uncoupled learning rules whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game.
Similar resources
Regret Testing: A Simple Payoff-Based Procedure for Learning Nash Equilibrium
A learning rule is uncoupled if a player does not condition his strategy on the opponent's payoffs. It is radically uncoupled if a player does not condition his strategy on the opponent's actions or payoffs. We demonstrate a family of simple, radically uncoupled learning rules whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game. Keywor...
Unifying Convergence and No-Regret in Multiagent Learning
We present a new multiagent learning algorithm, RVσ(t), that builds on an earlier version, ReDVaLeR. ReDVaLeR could guarantee (a) convergence to best response against stationary opponents and either (b) constant bounded regret against arbitrary opponents, or (c) convergence to Nash equilibrium policies in self-play. But it makes two strong assumptions: (1) that it can distinguish between self-...
Econometrics for Learning Agents (working paper)
The main goal of this paper is to develop a theory of inference of player valuations from observed data in the generalized second price auction without relying on the Nash equilibrium assumption. Existing work in Economics on inferring agent values from data relies on the assumption that all participant strategies are best responses of the observed play of other players, i.e. they constitute a ...
On No-Regret Learning, Fictitious Play, and Nash Equilibrium
This paper addresses the question: what is the outcome of multi-agent learning via no-regret algorithms in repeated games? Specifically, can the outcome of no-regret learning be characterized by traditional game-theoretic solution concepts, such as Nash equilibrium? The conclusion of this study is that no-regret learning is reminiscent of fictitious play: play converges to Nash equilibrium in domin...
Bayesian Opponent Exploitation in Imperfect-Information Games
Two fundamental problems in computational game theory are computing a Nash equilibrium and learning to exploit opponents given observations of their play (opponent exploitation). The latter is perhaps even more important than the former: Nash equilibrium does not have a compelling theoretical justification in game classes other than two-player zero-sum, and for all games one can potentially do ...
Publication date: 2006